Queueing with redundant requests: exact analysis

Authors

  • Kristen Gardner
  • Samuel Zbarsky
  • Sherwin Doroudi
  • Mor Harchol-Balter
  • Esa Hyytiä
  • Alan Scheller-Wolf
Abstract

Recent computer systems research has proposed using redundant requests to reduce latency. The idea is to run a request on multiple servers and wait for the first completion (discarding all remaining copies of the request). However, there is no exact analysis of systems with redundancy. This paper presents the first exact analysis of systems with redundancy. We allow for any number of classes of redundant requests, any number of classes of non-redundant requests, any degree of redundancy, and any number of heterogeneous servers. In all cases we derive the limiting distribution of the state of the system. In small (two or three server) systems, we derive simple forms for the distribution of response time of both the redundant classes and non-redundant classes, and we quantify the “gain” to redundant classes and “pain” to non-redundant classes caused by redundancy. We find some surprising results. First, the response time of a fully redundant class follows a simple exponential distribution and that of the non-redundant class follows a generalized hyperexponential. Second, fully redundant classes are “immune” to any pain caused by other classes becoming redundant. We also compare redundancy with other approaches for reducing latency, such as optimal probabilistic splitting of a class among servers (Opt-Split) and join-the-shortest-queue (JSQ) routing of a class. We find that, in many cases, redundancy outperforms JSQ and Opt-Split with respect to overall response time, making it an attractive solution.
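The core mechanism described in the abstract (send copies to several servers, keep the first completion) can be illustrated with a minimal sketch. It assumes exponentially distributed service times that are independent across servers, and it ignores queueing entirely, so it shows only why the first completion among idle servers is fast and is not the paper's analysis. The function names and the rates 1.0 and 2.0 are hypothetical, chosen for illustration.

```python
import random

def redundant_service_time(rates, rng):
    """Service time of one request replicated to idle servers.

    Assumption: server i serves at exponential rate rates[i], independently
    of the others. The first copy to finish wins; the rest are discarded.
    """
    return min(rng.expovariate(mu) for mu in rates)

def mean_redundant_service_time(rates, samples=100_000, seed=0):
    """Monte Carlo estimate of the mean first-completion time."""
    rng = random.Random(seed)
    total = sum(redundant_service_time(rates, rng) for _ in range(samples))
    return total / samples

rates = [1.0, 2.0]          # hypothetical service rates of two servers
estimate = mean_redundant_service_time(rates)
exact = 1.0 / sum(rates)    # the minimum of independent exponentials is
                            # exponential with the summed rate
print(f"simulated mean: {estimate:.4f}, exact mean: {exact:.4f}")
```

Under these assumptions the simulated mean should land close to 1 / (1.0 + 2.0). Once requests queue behind one another, this simple identity no longer gives the response time, which is the case the paper's exact analysis addresses.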


Similar resources

S&X: Decoupling Server Slowdown (S) and Job Size (X) in Modeling Job Redundancy

Recent computer systems research has proposed using redundant requests to reduce latency. The idea is to replicate a request so that it joins the queue at multiple servers. The request is considered complete as soon as any one copy of the request completes. Redundancy is beneficial because it allows us to overcome server-side variability – the fact that the server we choose might be temporarily...

Stochastic Bandwidth Packing Process: Stability Conditions via Lyapunov Function Technique

We consider the following stochastic bandwidth packing process: the requests for communication bandwidth of different sizes arrive at times t = 0, 1, 2, . . . and are allocated to a communication link using “largest first” rule. Each request takes a unit time to complete. The unallocated requests form queues. Coffman and Stolyar [6] introduced this system and posed the following question: under...

The MDS Queue: Analysing Latency Performance of Codes and Redundant Requests

In order to scale economically, data centers are increasingly evolving their data storage methods from the use of simple data replication to the use of more powerful erasure codes, which provide the same level of reliability as replication-based methods at a significantly lower storage cost. In particular, it is well known that Maximum-Distance-Separable (MDS) codes, such as Reed-Solomon codes, ...

Performability Modelling of Distributed Systems using Layered Queueing Networks

Proliferation of large and complex fault-tolerant distributed systems in recent years has stimulated the combined modelling of performance and dependability of such systems. For large systems it may be very expensive to compute valid performance estimates to be used in the combined performability measures. This work considers two different classes of fault-tolerant client-server systems, in whi...

Heavy Tails in Queueing Systems: Impact of Parallelism on Tail Performance

In this paper we quantify the efficiency of parallelism in systems that are prone to failures and exhibit power law processing delays. We characterize the performance of two prototype schemes of parallelism, redundant and split, in terms of both the power law exponent and exact asymptotics of the delay distribution tail. We also develop the optimal splitting scheme which ensures that split alwa...


Journal title:
  • Queueing Syst.

Volume 83

Publication date: 2016